The objective of this module is to discuss the application and function of animal movement models, specifically, Contiuous-Time Movement Models (ctmm), and explore measures of overlap.
You are at the bar, and you just got you and your friends a third round of drinks. You turn around and the dance floor is packed like sardines, and your friends are at the very front next to the DJ booth. How do you get from where you are to where you’re going? Maybe you take a step to the left, two steps forward, another back and to the right. Each step you take is independent from the last, as the dance floor is moving, grooving and no one is staying still. This goes on (and on and on) until you have made your way back to your friends.
Some movement ecologists would consider this navigation an example of a random walk model. Random walk models assume that in a time period, an individual will take random and independent steps that are identically distributed in size and away from their previous position (Nau 2014). However, these steps do depend on the location of your previous step, with each step the starting point of your next.. that is - unless you’ve figured out teleportation.
Okay. Now, you’ve made it home from your run and you want to see the data on your fitness tracker app. But, OH NO! You must have accidentally turned it on last night when you were leaving the bar. It tracked your uber ride home, when you got out of bed to get a glass of water, when you laid on the bathroom floor for a while, and then the run. That is a lot of movement data!
This, like any GPS monitoring system, is an example of continuous-time stochastic process modeling. That is, this is a model that accounts for finely sampled and continuous modern data collection (Calabrese et al., 2016).
CRW was so focused on the sampling schedule that it neglected continuous autocorrelated movement. CTSP models, on the other hand, are able to separate these factors while taking both into consideration. CTSP models are able to accommodate a range of data characteristics including irregular sampling schedules and varying levels of autocorrelation. However, CTSP modeling can be statistically complex and lack of comprehensive software to process this kind of data made it relatively inaccessible.
Fortunately, the continuous-time movement modeling (ctmm) package for R was created. ctmm allows a broad array of movement related data analyses including diagnostic data visualization and model fitting. These parameters can be used to analyze various metrics related to animal movement. In this module, we will examine evidence of range residency, calculate home range size with confidence intervals, and visualize the overlap between the home ranges of two coyotes.
As shown in the background literature, advancements in movement modelling have allowed for more accurate home range estimates that account for both the time dimension and autocorrelation in global position system (GPS) data. Below is an example of the implementation of the R package {ctmm} (Contnuous-Time Movement Modeling) to calculate the autocorrelated kernel density estimates (AKDE) of coyotes (Canis latrans) from publicly available movement data shared on Movebank (Mahoney & Young, 2017). These data were collected via Lotek GPS collars (Model GPS3300S; Lotek, Newmarket, ON, Canada) on 8 coyotes in Idaho between 2004-2015. should add hyperlinks in to Movebank repository site and the doi of their paper
We will begin by loading in a subset of their data. To start, we will only use data from ~ 1 months of the study (January 2005) because there are hundreds-of-thousands of data points and no one likes code that takes days to run :’)
Step 1: Load Packages Recall the packages we installed earlier? These are going to come into play now.
library(ctmm)
library(curl)
## Using libcurl 7.64.1 with LibreSSL/2.8.3
Step 2: Load Data Now, lets load in the Movebank GPS data using {curl} and save it to object bigdata
ctmm is designed to go hand in hand with Movebank which is an open access source for GPS animal tracking data. For ctmm to function, data must be in Movebank format and coerced into a telemetry object. Telemetry objects in ctmm contain information about the movement being analyzed including GPS location and sampling schedule. Objects from the {move} package in R can also be utilized in ctmm by using the as.telemetry function. Once the object is in ctmm format, it can be manipulated and visualized to aid in general characterization of the data.
f <- curl("https://raw.githubusercontent.com/vzdanowicz/AN588_ctmm/main/thisisit.csv")
bigdata <- read.csv(f, header = TRUE, sep = ",", stringsAsFactors = FALSE)
# Creating ctmm telemetry object (can be read directly from Movebank)
coyotes <- as.telemetry(bigdata)
## Minimum sampling interval of 4 minutes in IdCoy_P1_1
## Minimum sampling interval of 4 minutes in IdCoy_P1_2
## Minimum sampling interval of 4 minutes in IdCoy_P2_1
## Minimum sampling interval of 4 minutes in IdCoy_P2_3
## Minimum sampling interval of 1 minutes in IdCoy_P2_6
## Minimum sampling interval of 4 minutes in IdCoy_P3_1
## Minimum sampling interval of 3 minutes in IdCoy_P4_2
## Minimum sampling interval of 4 minutes in IdCoy_P5_2
# Changing the names of the coyote IDs to make coding easier!
names(coyotes)<-c('coyA','coyB','coyC','coyD','coyE','coyF','coyG','coyH')
Step 3: Explore!
As with any new data - it is important to visually explore by plotting!
Let’s visualize…
par(mfrow = c(1, 1))
plot(coyotes, col = rainbow(length(coyotes)))
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
title("All Coyotes")
We are going to start by first pulling out all of the movement data for 1 coyote, coyG, to create the object coy1
coy1 <- coyotes$coyG
plot(coy1)
## DOP values missing. Assuming DOP=1.
We can see from the clustering of GPS points that this coyote appears to be range resident with no obvious migrations. The next step of our workflow is to calculate and plot variograms from the GPS data for this individual.
Variograms represent autocorrelated data as the semi-variance between positions with varying time-lags. The structure of the variogram presents important information on the movement of an animal, as well as what method of movement modeling is best suited to the data. Arguably, the most crucial element of a variogram is if the semi-variance reaches an asymptote. This indicates that an animal is range-resident, which allows our range calculation to be meaningful.
Let’s use the code below to calculate and plot the variograms for coy1.
vg.coy1<-variogram(coy1)
#plotting long/short lag variograms
par(mfrow = c(1, 2))
plot(vg.coy1)
title("Long Lag")
plot(vg.coy1,fraction=0.005)
title("Short Lag")
The asymptote of the variogram for
coy1 at longer (day) lags supports our visual analysis that coy1 is range-resident, while the slight upward curve of the short (minutes) lags show evidence of directional persistence in the data.
We can then use either variogram.fit() or ctmm.guess() to generate our initial parameter guesses/estimates. These variables will later be passed to ctmm.select or ctmm.fit to better assess the model.
If run in your console, variogram.fit() allows you to visually assess and adjust parameters via the manipulate gear in top left corner of the plot pane.
NOTE: Though you can save these parameters for later use in model fitting by selecting ‘save to GUESS’ from manipulate, the code below already includes the creation of object
GUESSwithout needing to save thevariogram.fit()parameters from the plot pane.
variogram.fit(vg.coy1)
## An object of class "ctmm"
## [[1]]
## [1] "stationary"
##
## [[2]]
## An object of class "covm"
## x y
## x 344672.6 0.0
## y 0.0 344672.6
## Slot "par":
## major minor angle
## 344672.6 344672.6 0.0
##
## Slot "isotropic":
## [1] FALSE
##
##
## [[3]]
## gps NA
## FALSE
##
## [[4]]
## [1] FALSE
##
## [[5]]
## [1] "x" "y"
##
## [[6]]
## [1] FALSE
##
## [[7]]
## position velocity
## 12047.4182 440.8345
##
## [[8]]
## [1] 0
##
## [[9]]
## [1] TRUE
##
## Slot "info":
## $identity
## [1] "IdCoy_P4_2"
##
## $timezone
## [1] "UTC"
##
## $projection
## [1] "+proj=tpeqd +lat_1=43.7655737679396 +lon_1=-112.693432141095 +lat_2=43.7454794721036 +lon_2=-112.646927688557 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
##
## $axes
## [1] "x" "y"
# Since we are knitting, I don't think the variogram fit will be able to be saved as an object without doing manually...
GUESS<-variogram.fit(vg.coy1)
We can also use ctmm.guess() and the variogram for coy1, vg.coy1, to estimate initial model parameters.
coy1_GUESS <- ctmm.guess(coy1,variogram = vg.coy1,interactive=FALSE)
These parameters, GUESS and coy1_GUESS, are critical in our ability to assess the most suitable model type for the data at hand. We will next explore how ctmm.select and ctmm.fit allow us to test and visualize potential model types.
In order to use autocorrelated kernel density estimation (AKDE) to perform accurate home range analyses, we have to ensure that our data indicates range residency and then choose the best model to represent the data. The first step as discussed above is to use a variogram to visualize whether movement is confined to a given space or home range. Then, a movement model is chosen.
The independent identically distributed (IID) process assumes each data point is independent and random. However, animal movement is automatically correlated since one’s location in space is dependent on continuous movement (i.e. coyotes also can’t teleport). This means the IID model is not a good model for our purposes despite it’s use in older kernel density estimation metrics. The brownian motion (BM) process assumes random movement and diffusion in unlimited space and is good for data not comprehensive enough to demonstrate home range or velocity autocorrelation. Typically, BM processes are not suitable for our purposes. However, the Ohrnstein-Uhlenbeck (OU) process uses BM but assumes fidelity to a general location and is suitable for data which shows signs of residency. What this model lacks is the ability to incorporate autocorrelated velocities. The integrated OU (IOU) process solves this problem for data with high resolution but is unable to demonstrate range residency. All of these shortcomings are solved in the Ohrnstein-Uhlenbeck Foraging (OUF) process which combines OU and IOU processes to enable analysis of data demonstrating both velocity autocorrelation and range residency.
This is especially relevant for improved data sampling methods which are able to combine long sampling periods and high resolution of sampling points. Given the extensive dataset used here, we can expect that our best fit movement model might be the OUF model. We also have to differentiate between the accuracy of isotropic versus anisotropic models for each movement model. By running ctmm.fit and ctmm.select, we can produce data to determine the best fit model. Once we have decided between isotropic and anisotropic, we can proceed to visualize those model types to confirm the results of the initial model guess.
Let’s compare the parameters from variogram.fit and ctmm.guess.
vg.fitted.mods <- ctmm.select(coy1,CTMM = GUESS,verbose = TRUE) #can take 5+ mins to run
summary(vg.fitted.mods)
## ΔAICc ΔRMSPE (m) DOF[area]
## OUF anisotropic 0.00000 158.5202 77.25247
## OUF isotropic 13.87693 150.9085 78.74946
## OUf anisotropic 1410.24731 0.0000 559.37438
## OU anisotropic 2256.52201 180.5625 33.09277
guess.fitted.mods <- ctmm.select(coy1,CTMM = coy1_GUESS,verbose = TRUE) #can take 5+ mins to run
summary(guess.fitted.mods)
## ΔAICc ΔRMSPE (m) DOF[area]
## OUF anisotropic 0.00000 158.5202 77.25251
## OUF isotropic 13.87693 150.9085 78.74945
## OUf anisotropic 1410.24731 0.0000 559.37438
## OU anisotropic 2256.52201 180.5625 33.09278
Both methods yield the same model fittings. Thus - either are acceptable for initial parameter guessing.
We can see from the summary above, that the anisotropic version of OUF is the ideal model. We can also see that the anisotropic versions of each model were favored over their isotropic counterpart (we are looking for lowest AIC values). We can now visually examine the fits of the anisotropic versions of OU and OUF models.
#Extract the fitted anisotropic versions of IID,OU,and OUF.
OU<-guess.fitted.mods[[4]]
OUF<-guess.fitted.mods[[1]]
ctmm.fit accepts a ctmm object as well as the guesses generated by variogram.fit (in our case, OU and OUF) and generates values for comparing the fit of each model, including point estimates, confidence intervals, and AICc. Based on this output, the user can select the most appropriate model for their data. We explain some of these parameters in more detail in the following paragraphs.
takes ~2mins to run the fits below
M.OU <- ctmm.fit(coy1,OU)
M.OUF <- ctmm.fit(coy1,OUF)
FITS <- list(OU=M.OU,OUF=M.OUF)
summary(FITS)
## ΔAICc ΔRMSPE (m) DOF[area]
## OUF 0.000 0.00000 77.25249
## OU 2256.522 22.04227 33.09278
AICc is the (linearly) corrected Akaike information criteria. AIC balances likelihood against model complexity in a way that is good if we want to make optimal predictions. A lower AIC is better, thus the OUF model seems better here.
The fit parameter DOF[mean] is the number of degrees of freedom worth of data we have to estimate the stationary mean parameter, assuming that the model is correct.
par(mfrow = c(2, 2))
plot(vg.coy1, CTMM=OU, col.CTMM = '#1b9e77')
title("OU")
plot(vg.coy1, CTMM=OU, col.CTMM = '#1b9e77', fraction = 0.005)
title("OU")
plot(vg.coy1, CTMM=OUF, col.CTMM = '#1b1e77')
title("OUF")
plot(vg.coy1, CTMM=OUF, col.CTMM = '#1b1e77', fraction = 0.005)
title("OUF")
We can see from the figures above that OUF is a better fitted model (though not perfect!). The data we are using here are a subset of a much larger body of data. We reduced the number of observations so that the ctmm models would run in a reasonable amount of time, but it is likely that the OUF model would look even better fitted if more observation points were included. Additionally, even though the OUF model is not perfect in the shorter time lag plot, the overall trend over multiple days (longer time lag) is adequate.
par(mfrow = c(1, 2))
plot(vg.coy1,CTMM=FITS,col.CTMM=c("red","purple"),fraction=0.65,level=0.5)
title("zoomed out")
plot(vg.coy1,CTMM=FITS,col.CTMM=c("red","purple"),fraction=0.05,level=0.5)
title("zoomed in")
We can see the OU (pink) and OUF (purple) models plotted here against the variogram we created earlier. The zoomed in figure shows the purple line, the OUF model, follows the variogram more closely compared to the OU model.
The previous steps revealed the OUF model is best suited for our data. We can now use ctmm.fit to fit our model via maximum likelihood. This function processes a telemetry object and model specification (i.e. OUF, OU, IID, etc.), returning the maximum likelihood parameter and interval estimates. The objects returned by ctmm.fit will allow us to compare the fitted model to the data.
this can take ~2 mins to run
coy1_FIT <- ctmm.fit(coy1, CTMM = OUF, method = "pHREML", control = list(method="pNewton",cores=2), trace = TRUE)
## Maximizing likelihood.
## Calculating Hessian.
## Calculating REML gradient.
## Calculating REML Hessian.
summary(coy1_FIT)
## $name
## [1] "OUF anisotropic"
##
## $DOF
## mean area speed
## 41.39732 77.25250 4258.41374
##
## $CI
## low est high
## area (square kilometers) 5.226104 6.619441 8.174942
## τ[position] (hours) 3.384463 4.390025 5.694352
## τ[velocity] (minutes) 3.704709 3.913170 4.133361
## speed (kilometers/day) 37.089802 37.655334 38.220749
NOTE:
CTMM = OUFbased on our previous figures/calculations which show that is the best fit model compared to the OU model. Additionally, the argumenttrace = TRUEprovides a progress report in the console window asctmm.fitruns.
coy2 <- coyotes$coyH
plot(coy2)
## DOP values missing. Assuming DOP=1.
vg.coy2<-variogram(coy2)
par(mfrow = c(1, 2))
plot(vg.coy2)
plot(vg.coy2,fraction=0.005)
What do the variograms for this coyote suggest? What might the overall picture of the first variogram mean? How about the second variogram which is zoomed in?
variogram.fit(vg.coy2)
## An object of class "ctmm"
## [[1]]
## [1] "stationary"
##
## [[2]]
## An object of class "covm"
## x y
## x 1907134 0
## y 0 1907134
## Slot "par":
## major minor angle
## 1907134 1907134 0
##
## Slot "isotropic":
## [1] FALSE
##
##
## [[3]]
## gps NA
## FALSE
##
## [[4]]
## [1] FALSE
##
## [[5]]
## [1] "x" "y"
##
## [[6]]
## [1] FALSE
##
## [[7]]
## position velocity
## 28270.8795 632.9355
##
## [[8]]
## [1] 0
##
## [[9]]
## [1] TRUE
##
## Slot "info":
## $identity
## [1] "IdCoy_P5_2"
##
## $timezone
## [1] "UTC"
##
## $projection
## [1] "+proj=tpeqd +lat_1=43.7655737679396 +lon_1=-112.693432141095 +lat_2=43.7454794721036 +lon_2=-112.646927688557 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
##
## $axes
## [1] "x" "y"
coy2_GUESS <- ctmm.guess(coy2,variogram = vg.coy2,interactive=FALSE)
coy2_SELECT <- ctmm.select(coy2,CTMM = coy2_GUESS,verbose = TRUE)
summary(coy2_SELECT)
## ΔAICc ΔRMSPE (m) DOF[area]
## OUF anisotropic 0.00000 528.9691 21.595785
## OUF isotropic 18.93576 515.4891 21.850236
## OU anisotropic 2089.00334 626.6339 9.849115
## OUf anisotropic 2254.68758 0.0000 413.798019
OU2<-coy2_SELECT[[4]]
OUF2<-coy2_SELECT[[1]]
M.OU2 <- ctmm.fit(coy2,OU2)
M.OUF2 <- ctmm.fit(coy2,OUF2)
FITS2 <- list(OU=M.OU2,OUF=M.OUF2)
summary(FITS2)
## ΔAICc ΔRMSPE (m) DOF[area]
## OUF 0.000 528.969 21.59579
## OU 2254.688 0.000 413.79802
par(mfrow = c(2, 2))
plot(vg.coy2, CTMM=M.OU2, col.CTMM = '#1b9e77')
title("OU")
plot(vg.coy2, CTMM=M.OU2, col.CTMM = '#1b9e77', fraction = 0.005)
title("OU")
plot(vg.coy2, CTMM=M.OUF2, col.CTMM = '#1b1e77')
title("OUF")
plot(vg.coy2, CTMM=M.OUF2, col.CTMM = '#1b1e77', fraction = 0.005)
title("OUF")
Which method fits the model best as suggested by the AICc? What do our plots show us about whether OU or OUF best fits the movement data?
# We don't need to fit again. Already fitted OUF previously.
coy2_FIT <- ctmm.fit(coy2, CTMM = M.OUF2, method = "pHREML", control = list(method="pNewton",cores=2), trace = TRUE)
## Maximizing likelihood.
## Calculating Hessian.
## Calculating REML gradient.
## Calculating REML Hessian.
summary(coy2_FIT)
## $name
## [1] "OUF anisotropic"
##
## $DOF
## mean area speed
## 12.36824 21.59579 4280.15535
##
## $CI
## low est high
## area (square kilometers) 23.791749 38.14927 55.841790
## τ[position] (hours) 9.070859 15.76741 27.407693
## τ[velocity] (minutes) 3.471855 3.66519 3.869292
## speed (kilometers/day) 48.562337 49.30089 50.039283
# Is the same as:
summary(M.OUF2)
## $name
## [1] "OUF anisotropic"
##
## $DOF
## mean area speed
## 12.36824 21.59579 4280.15535
##
## $CI
## low est high
## area (square kilometers) 23.791748 38.14927 55.841788
## τ[position] (hours) 9.070858 15.76741 27.407693
## τ[velocity] (minutes) 3.471855 3.66519 3.869292
## speed (kilometers/day) 48.562337 49.30089 50.039283
lets just try to run both at the same time again !!! cries !!!
FS - REJOICE! With both models done, we don’t have to do both again. We can just combine them in a list!
both.coyotes <- coyotes[c("coyG","coyH")]
#variograms
variograms <- list(vg.coy1, vg.coy2)
plot.vg <- lapply(1:2, function(i) plot(variograms[[i]],fraction = 0.5))
GUESS_2 <- list(guess.fitted.mods, coy2_GUESS)
FITS_2 <- list(coy1_FIT, coy2_FIT)
par(mfrow = c(2, 2))
plot(variograms[[1]],CTMM=FITS_2,col.CTMM=c("purple"),fraction=0.65,level=0.5)
plot(variograms[[1]],CTMM=FITS_2,col.CTMM=c("purple"),fraction=0.05,level=0.5)
plot(variograms[[2]],CTMM=FITS_2,col.CTMM=c("blue"),fraction=0.65,level=0.5)
plot(variograms[[2]],CTMM=FITS_2,col.CTMM=c("blue"),fraction=0.05,level=0.5)
names(FITS_2) <- names(both.coyotes[1:2])
AKDE corrects for biases due to autocorrelation, small effective sample size, and irregular sampling in time
UDS_coyotes <- akde(both.coyotes[1:2],FITS_2)
plot(UDS_coyotes[[1]])
summary(UDS_coyotes[[1]])
## $DOF
## area bandwidth
## 77.2525 199.6350
##
## $CI
## low est high
## area (square kilometers) 4.823146 6.10905 7.544614
plot(UDS_coyotes[[2]])
summary(UDS_coyotes[[2]])
## $DOF
## area bandwidth
## 21.59579 36.11150
##
## $CI
## low est high
## area (square kilometers) 21.91205 35.13524 51.42994
This function calculates a useful measure of similarity between distributions evaluate overlap of 95% Home Range.The choice method=“BA” computes the Bhattacharyya’s affinity (BA). Values range from zero (no overlap) to 1 (identical Utilization Distributions).
95% level of confidence
overlap(UDS_coyotes)
## , , low
##
## coyG coyH
## coyG 1.00000000 0.06704109
## coyH 0.06704109 1.00000000
##
## , , est
##
## coyG coyH
## coyG 1.0000000 0.1564097
## coyH 0.1564097 1.0000000
##
## , , high
##
## coyG coyH
## coyG 1.0000000 0.3116932
## coyH 0.3116932 1.0000000
plotting overlap
#use the plot function to look at the AKDE UD
#95% Home Range (95% is the default)
#The middle contour represent the maximum likelihood area where the animal spends 95% of its time.
plot(both.coyotes, UD = UDS_coyotes, col=rainbow(length(both.coyotes)))
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
We can also estimate the CDE (conditional distribution of encounters) (Noonan et al., 2020). CDE is a concept describing the long-term encounter location probabilities for movement within home ranges.
CDE_coyotes <- encounter(UDS_coyotes)
plot(both.coyotes,
col=c("#FF0000", "#F2AD00"),
UD=CDE_coyotes,
col.DF="#046C9A",
col.grid = NA)
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
Literature Cited
C, J. M., Fleming, C. H., & Gurarie, E. (2016). ctmm: an r package for analyzing animal relocation data as a continuous-time stochastic process. Methods in Ecology and Evolution, 7(9), 1124–1132.
Codling, E. A., Plank, M. J., & Benhamou, S. (2008). Random walk models in biology. Journal of The Royal Society Interface, 5(25), 813–834. https://doi.org/10.1098/rsif.2008.0014 Nau, R. (2014, November 4). Notes on the random walk model - people.duke.edu. Notes on the random walk model. Retrieved 2021, from https://people.duke.edu/~rnau/Notes_on_the_random_walk_model--Robert_Nau.pdf